Claims 



What is claimed is: 

1. A method for tracking an object of interest in a video
processing system, the method comprising the steps of:

generating for a given measurement interval an audio
locator output and a video locator output, each indicative of a
location of the object of interest;

applying a set of rules to determine a manner in which at
least one of the audio locator output and the video locator output
will be utilized to adjust a setting of the camera based on the
given measurement interval; and

adjusting the camera setting in accordance with the
determined manner of utilization.



2. The method of claim 1 wherein the object of interest
comprises a moving person.



3. The method of claim 1 wherein the camera is a pan-tilt-zoom
(PTZ) camera having adjustable pan, tilt and zoom settings.

4. The method of claim 1 wherein the set of rules includes
determining if the audio locator and video locator outputs are
sufficiently close for the given measurement interval, and
utilizing only the audio locator output to adjust the camera
setting if the audio and video locator outputs are not within a
specified range of one another for the given measurement interval.



5. The method of claim 4 wherein the set of rules further
includes utilizing the video locator output to adjust the camera
setting only if the audio and video locator outputs are within a
specified range of one another for the given measurement interval.

6. The method of claim 5 wherein the set of rules further
includes determining if a confidence indicator associated with the
video locator output is above a specified video locator threshold
for the given measurement interval, and utilizing the video locator
output to adjust the camera setting only if the video locator
confidence indicator is above the video locator threshold for the
given measurement interval.



7. The method of claim 1 wherein the set of rules includes
determining based on the audio locator output if the object of
interest corresponds to a new speaker in a multiple-participant
system, and if a new speaker is detected, directing the camera to
zoom out by a predetermined amount and to turn in a direction of
the new speaker.



8. The method of claim 1 wherein the set of rules includes
determining based on the audio locator output if the object of
interest corresponds to a same speaker in a multiple-participant
system, and if a same speaker is detected, utilizing the video
locator output to adjust the camera setting so as to place the same
speaker at a designated position within one or more video frames
generated by the camera.



9. The method of claim 8 wherein the set of rules further
includes adjusting a zoom setting of the camera until a head of the
identified same speaker occupies a designated portion of a given
one of the one or more video frames generated by the camera.



10. The method of claim 1 wherein the set of rules specifies
that the camera is zoomed out by a predetermined amount after a
detected period of continued silence exceeds a first amount of
time.

11. The method of claim 10 wherein the set of rules further
specifies that the camera is zoomed out by an additional amount if
the detected period of continued silence exceeds a second amount of
time greater than the first amount of time.



12. An apparatus for tracking an object of interest in a
video processing system, the apparatus comprising:

a camera; and

a processor coupled to the camera and operative (i) to
process an audio locator output and a video locator output, each
indicative of a location of the object of interest for a given
measurement interval; and (ii) to apply a set of rules to determine
a manner in which at least one of the audio locator output and the
video locator output will be utilized to adjust a setting of the
camera based on the given measurement interval, such that the
camera setting is adjusted in accordance with the determined manner
of utilization.



13. The apparatus of claim 12 wherein the object of interest
comprises a moving person.

14. The apparatus of claim 12 wherein the camera is a
pan-tilt-zoom (PTZ) camera having adjustable pan, tilt and zoom
settings.



15. The apparatus of claim 12 wherein the set of rules
includes determining if the audio locator and video locator outputs
are sufficiently close for the given measurement interval, and
utilizing only the audio locator output to adjust the camera
setting if the audio and video locator outputs are not within a
specified range of one another for the given measurement interval.

16. The apparatus of claim 15 wherein the set of rules
further includes utilizing the video locator output to adjust the
camera setting only if the audio and video locator outputs are
within a specified range of one another for the given measurement
interval.



17. The apparatus of claim 16 wherein the set of rules
further includes determining if a confidence indicator associated
with the video locator output is above a specified video locator
threshold for the given measurement interval, and utilizing the
video locator output to adjust the camera setting only if the video
locator confidence indicator is above the video locator threshold
for the given measurement interval.

18. The apparatus of claim 12 wherein the set of rules
includes determining based on the audio locator output if the
object of interest corresponds to a new speaker in a
multiple-participant system, and if a new speaker is detected,
directing the camera to zoom out by a predetermined amount and to
turn in a direction of the new speaker.

19. The apparatus of claim 12 wherein the set of rules
includes determining based on the audio locator output if the
object of interest corresponds to a same speaker in a
multiple-participant system, and if a same speaker is detected,
utilizing the video locator output to adjust the camera setting so
as to place the same speaker at a designated position within one or
more video frames generated by the camera.



20. The apparatus of claim 19 wherein the set of rules
further includes adjusting a zoom setting of the camera until a
head of the identified same speaker occupies a designated portion
of a given one of the one or more video frames.

21. The apparatus of claim 12 wherein the set of rules
specifies that the camera is zoomed out by a predetermined amount
after a detected period of continued silence exceeds a first amount
of time.

22. The apparatus of claim 21 wherein the set of rules
further specifies that the camera is zoomed out by an additional
amount if the detected period of continued silence exceeds a second
amount of time greater than the first amount of time.

23. An article of manufacture comprising a storage medium for
storing one or more programs for tracking an object of interest in
a video processing system, wherein the one or more programs when
executed by a processor implement the steps of:

generating for a given measurement interval an audio
locator output and a video locator output, each indicative of a
location of the object of interest;

applying a set of rules to determine a manner in which at
least one of the audio locator output and the video locator output
will be utilized to adjust a setting of the camera based on the
given measurement interval; and

adjusting the camera setting in accordance with the
determined manner of utilization.
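
For illustration only, and forming no part of the claims, the rule set recited above might be sketched as follows. This is a minimal sketch under stated assumptions: the names `LocatorOutput`, `choose_pan_source` and `silence_zoom_out`, the use of a pan angle in degrees as the location, and all numeric defaults are hypothetical and do not come from the claims. The claims also do not say what happens when the two outputs agree but the video confidence indicator is low; falling back to the audio locator output in that case is an assumption.

```python
from dataclasses import dataclass


@dataclass
class LocatorOutput:
    """One locator's estimate for a measurement interval (hypothetical structure)."""
    pan_deg: float           # estimated horizontal direction of the object of interest
    confidence: float = 1.0  # confidence indicator, assumed here to lie in [0, 1]


def choose_pan_source(audio: LocatorOutput, video: LocatorOutput,
                      max_gap_deg: float = 10.0,
                      video_threshold: float = 0.5) -> str:
    """Pick which locator output drives the camera, in the manner of claims 4 to 6.

    The video output is used only if both outputs lie within the specified
    range of one another AND the video confidence indicator is above its
    threshold. Otherwise the audio output is used (for the low-confidence
    case this fallback is an assumption, not something the claims state).
    """
    within_range = abs(audio.pan_deg - video.pan_deg) <= max_gap_deg
    if within_range and video.confidence > video_threshold:
        return "video"
    return "audio"


def silence_zoom_out(silence_s: float, first_s: float = 5.0,
                     second_s: float = 15.0, step: float = 1.0,
                     extra_step: float = 1.0) -> float:
    """Total zoom-out amount for a period of silence, in the manner of claims 10 and 11."""
    if second_s <= first_s:
        raise ValueError("second_s must be greater than first_s")
    amount = 0.0
    if silence_s > first_s:
        amount += step        # predetermined zoom-out after the first amount of time
    if silence_s > second_s:
        amount += extra_step  # additional zoom-out after the second amount of time
    return amount
```

With these hypothetical defaults, audio and video estimates of 30 and 32 degrees with a video confidence of 0.9 select the video locator output, while estimates 40 degrees apart select the audio locator output regardless of confidence.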